Spotting scientific and technical specialization in biomedical documents using morphological clues
Authors
Abstract
Identifying the specialization level of health documents on the Internet is important, especially when the documents are read by non-expert users such as patients. Indeed, highly technical documents prevent patients from understanding the content and may negatively affect their health care process and their communication with medical doctors. When medical portals offer such a distinction, it results from manual categorization. We propose an automatic categorization of health documents according to their specialization level. We exploit morphological information obtained through the morphological analysis of lexemes. The evaluation shows that precision, recall and F-measure are often higher than 90%.
KEYWORDS: medical documents, specialization, supervised learning, constructional morphology, semantics.
Similar resources
Connected Component Based Word Spotting on Persian Handwritten image documents
Word spotting makes unindexed image documents searchable by locating a word or words in a document image, given a query word. This problem is challenging, mainly due to the large number of word classes with very small inter-class and substantial intra-class distances. In this paper, a segmentation-based word spotting method is presented for multi-writer Persian handwritten doc-...
A Region-Based Hashing Approach for Symbol Spotting in Technical Documents
In this paper a geometric hash function able to cluster similar regions and its use for symbol spotting in technical documents is presented. This hashing technique aims to perform a fast spotting process to find candidate locations needing neither a previous segmentation step nor a priori knowledge or learning step.
Comparative Study between Expert and Non-Expert Biomedical Writings: Their Morphology and Semantics
The amount of health information on the internet is constantly growing but little is done for detecting the technicality level of these documents and guiding of users towards documents which are appropriate to their expertise level. The objective of our work is to propose clues for the automatic distinction between expert and non expert medical documents. More precisely, we propose to study the...
Tagging gene and protein names in biomedical text
MOTIVATION The MEDLINE database of biomedical abstracts contains scientific knowledge about thousands of interacting genes and proteins. Automated text processing can aid in the comprehension and synthesis of this valuable information. The fundamental task of identifying gene and protein names is a necessary first step towards making full use of the information encoded in biomedical text. This ...
An Investigation on the Causes of a Rotor Bending and its Thermal Straightening (TECHNICAL NOTE)
Distortion or bend in a turbine rotor (especially HIP rotors) may be caused by a number of factors, either singularly or in combination. In general, the causes of rotor bend can be classified invariably in two categories: Rapidly forming permanent rotor bends and/or Slower forming rotor bends, which could trip the turbines’ emergency stop. One of the major modifying solutions for rapid repairin...
Journal title:
- TAL
Volume 52, Issue
Pages -
Publication date 2011